Custom identities #4764

galvana · 2024-03-30T22:46:22Z

Description Of Changes

This change adds the ability to use custom identities in dataset identity references:

collections:
  - name: loyalty
    fields:
      - name: id
        data_categories: [user.unique_id]
        fides_meta:
          identity: loyalty_id

Since a Fides instance is more likely to use multiple identities after this change (for example, email + customer_id), I also made a few changes in our graph utils to support multiple identities for SaaS connectors.

Code Changes

Privacy Center

Updated the PrivacyRequestForm.tsx to be able to render the new custom identities

Admin UI

Updated the privacy request table and privacy request detail page to support the identity labels that are now returned from the API

Identity management

Removed the providedidentitytype constraint on providedidentity.field_name to allow any custom defined field name
Updated the Identity schema to allow extra fields as long as they have the LabeledIdentity type
Updated cache_identity/get_cached_identity_data and persist_identity/get_persisted_identity functions to support labeled identities

Task execution

Updated pre_process_input_data in graph_task.py to return unique output values (see inline comments)
Removed single identity constraint from SaaS connectors

Steps to Confirm

Add a custom identity to the access request in fides/data/sample_project/privacy_center/config/config.json

"identity_inputs": {
  "email": "required",
  "loyalty_id": { "label": "Loyalty ID" }
},

Run nox -s "fides_env(test)"
Navigate to Systems & Vendors > Cookie House Loyalty Program > Integrations and enable the integration
Navigate to the Privacy Center and submit the following access request

Email: [email protected]
Loyalty ID: CH-1

Navigate back to the Admin UI and approve the request
Verify the presence of data from the postgres_example_test_extended_dataset in the DSR package that is written to fides_uploads

Pre-Merge Checklist

All CI Pipelines Succeeded
Documentation:
- documentation complete, PR opened in fidesdocs
- documentation issue created in fidesdocs
Issue Requirements are Met
Relevant Follow-Up Issues Created
Update CHANGELOG.md
For API changes, the Postman collection has been updated

vercel · 2024-03-30T22:46:28Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment

Name	Status	Preview	Comments	Updated (UTC)
fides-plus-nightly	⬜️ Ignored (Inspect)	Visit Preview		Apr 10, 2024 5:32am

cypress · 2024-03-30T22:58:24Z

Passing run #7150 ↗︎

0	4	0	0	0

Details:

Merge `98f7c26` into `40cef1a`...
Project: fides	Commit: `7fbc19b0bc ℹ️`
Status: Passed	Duration: 00:37 💡
Started: Apr 10, 2024 5:43 AM	Ended: Apr 10, 2024 5:44 AM

Review all test suite changes for PR #4764 ↗︎

codecov · 2024-04-01T02:21:05Z

Codecov Report

Attention: Patch coverage is 92.37288%, with 9 lines in your changes missing coverage. Please review.

Project coverage is 86.61%. Comparing base (40cef1a) to head (4b448f0).

Files	Patch %	Lines
src/fides/api/models/privacy_request.py	91.89%	1 Missing and 2 partials ⚠️
src/fides/api/util/collection_util.py	86.36%	1 Missing and 2 partials ⚠️
src/fides/api/schemas/redis_cache.py	95.00%	1 Missing and 1 partial ⚠️
...rc/fides/api/service/connectors/fides_connector.py	50.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4764      +/-   ##
==========================================
- Coverage   86.63%   86.61%   -0.02%     
==========================================
  Files         339      339              
  Lines       20008    20078      +70     
  Branches     2556     2583      +27     
==========================================
+ Hits        17333    17391      +58     
- Misses       2206     2215       +9     
- Partials      469      472       +3


adamsachs · 2024-04-08T14:19:17Z

making my way through a code review, but posting UAT testing results here per instructions on the PR (thank you for providing a great test setup!). things are looking good! happy path results:

also tested that if an invalid custom identity value is provided, the request still succeeds but gets no results in the extended dataset, as expected:

noting that the privacy center form validation is a little bit odd in that it allows you to click "continue" but then gives a validation error, as opposed to the other required fields (i.e. the 'standard' identity fields), which actually grey out/block the user from clicking "continue". but that seems acceptable to me, just wanted to point it out (i'm assuming this is expected!)

adamsachs

@galvana really nice work making this comprehensive update, and thank you for anticipating all the impacts this change may have, including with relaxing the constraint to allow >1 identity value!

most of my comments are relatively minor tweaks, and i'm able to generally follow the identity-management updates well - i think your approach for custom identity support there is clever and it seems pretty robust (a few things around the edges that may help to make it a bit more defensive/less prone to accidental error moving forward)!

in terms of the graph_task updates, i'm following the concrete functionality you've added, i just generally always have a bit of trouble wrapping my head around what pre_process_input_data does generally - it's a pretty weighty method! so i'm feeling a bit less confident in my analysis of those changes. nothing particularly stands out to me as problematic, but it may be good to sync up to understand a bit more concretely how those updates will play in with the overall workflow...

as mentioned above, UAT testing is looking good! so i'm about ready to approve this, perhaps you can just look over my comments and we can align on the graph_task updates, and then we should be good to push this through 👍

adamsachs · 2024-04-07T20:17:00Z

src/fides/api/schemas/redis_cache.py


 from fides.api.custom_types import PhoneNumber
 from fides.api.schemas.base_class import FidesSchema

+MultiValue = Union[Union[StrictInt, StrictStr], List[Union[StrictInt, StrictStr]]]


nit: i feel like there's a slightly more idiomatic way to do this - or at least it feels more readable to me. maybe you disagree, or i've got something wrong 😅

Suggested change

-MultiValue = Union[Union[StrictInt, StrictStr], List[Union[StrictInt, StrictStr]]]
+MultiValue = Union[StrictInt, StrictStr, List[Union[StrictInt, StrictStr]]]

I agree, I was struggling to get these types to work as expected so I completely looked past the double union 😆
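
For reference, `typing` flattens unions of unions, so the two spellings describe the same type and the suggestion is purely a readability change. A quick check, using plain `int` and `str` as stand-ins for pydantic's `StrictInt` and `StrictStr` (an assumption made only to keep the snippet dependency-free):

```python
from typing import List, Union

# Stand-ins for pydantic's StrictInt / StrictStr (assumption, to avoid a dependency).
Nested = Union[Union[int, str], List[Union[int, str]]]
Flat = Union[int, str, List[Union[int, str]]]

# typing flattens unions of unions, so both aliases compare equal.
same_type = Nested == Flat
print(same_type)
```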

adamsachs · 2024-04-07T20:20:40Z

src/fides/api/schemas/redis_cache.py

+    def __init__(self, **data: Any):
+        for field, value in data.items():
+            if field not in self.__fields__:
+                if isinstance(value, LabeledIdentity):
+                    data[field] = value
+                elif isinstance(value, dict) and "label" in value and "value" in value:
+                    data[field] = LabeledIdentity(**value)
+                else:
+                    raise ValueError(
+                        f'Custom identity "{field}" must be an instance of LabeledIdentity '
+                        '(e.g. {"label": "Field label", "value": "123"})'
+                    )
+        super().__init__(**data)


fancy!

nicely done to have this "validation" in the constructor to allow extra fields but still effectively constrain them 👍
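
A rough sketch of the pattern being praised here, written without pydantic so it runs on its own. `IdentitySketch`, its `KNOWN_FIELDS` tuple, and the dataclass version of `LabeledIdentity` are hypothetical stand-ins, not the classes in `redis_cache.py`; the point is only that unknown fields are accepted when they carry a label and a value, and rejected otherwise.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class LabeledIdentity:
    """Hypothetical stand-in for the schema's labeled identity type."""
    label: str
    value: Union[int, str]


class IdentitySketch:
    """Dependency-free approximation of an Identity that allows labeled extras."""

    KNOWN_FIELDS = ("email", "phone_number")  # assumed subset of the standard fields

    def __init__(self, **data: Any) -> None:
        # Standard fields are taken as-is and default to None.
        for field in self.KNOWN_FIELDS:
            setattr(self, field, data.pop(field, None))
        # Anything left over is a custom identity and must be labeled.
        for field, value in data.items():
            if isinstance(value, dict) and "label" in value and "value" in value:
                value = LabeledIdentity(**value)
            if not isinstance(value, LabeledIdentity):
                raise ValueError(
                    f'Custom identity "{field}" must be an instance of LabeledIdentity'
                )
            setattr(self, field, value)

    def dict(self) -> Dict[str, Any]:
        """Return labeled identities as their plain values."""
        return {
            key: value.value if isinstance(value, LabeledIdentity) else value
            for key, value in vars(self).items()
        }


identity = IdentitySketch(
    email="jane@example.com",
    loyalty_id={"label": "Loyalty ID", "value": "CH-1"},
)
flat = identity.dict()
```

Passing a bare string for a custom field, for example `IdentitySketch(loyalty_id="CH-1")`, raises `ValueError` in this sketch, which mirrors the constraint the constructor in the diff enforces.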

adamsachs · 2024-04-07T20:21:56Z

src/fides/api/schemas/redis_cache.py

+    def dict(self, *args: Any, **kwargs: Any) -> Dict[str, Any]:
+        """
+        Returns a dictionary with LabeledIdentity values returned as simple values.
+        """
+        d = super().dict(*args, **kwargs)
+        for key, value in self.__dict__.items():
+            if isinstance(value, LabeledIdentity):
+                d[key] = value.value
+            else:
+                d[key] = value
+        return d


adamsachs · 2024-04-07T20:56:10Z

src/fides/api/service/privacy_request/dsr_package/templates/welcome.html

+               {% for identity_type, identity_data in request.identity.items() %}
+               <div>{{ identity_data.label }}:</div>
+               <div>{{ identity_data.value }}</div>
+               {% endfor %}


nice, really comprehensive updates to get this working smoothly across the whole stack!

adamsachs · 2024-04-07T21:00:38Z

src/fides/data/sample_project/postgres_sample.sql

+('CH-1', 'Jane Customer', 100, 'Cookie Rookie'),
+('CH-2', 'John Customer', 200, 'Cookie Connoisseur');


https://www.thecookierookie.com/ 💯

I just thought it sounded funny 😆 I didn't know about the site

hahaha i know i also thought it sounded great so i looked it up and was so happy with what i found

adamsachs · 2024-04-08T14:34:06Z

src/fides/api/models/privacy_request.py

+                if isinstance(value, dict):
+                    label = value["label"]
+                    value = value["value"]
+                else:
+                    label = None


may be worth making this a bit more defensive, or at least adding in some code comments? i know with the current code this is safe, but i get a little bit concerned about a direct dict lookup like this buried pretty deep in the code causing a tough-to-debug/predict problem later on. i guess the risk would be if we ever evolved to have a proper field on the Identity class with a dict type. but maybe just adding in some extra checks on this side too, similar to as you've done in the Identity constructor (i.e. and "label" in value and "value" in value:)? even throwing a more specific runtime error there to alert developers could help prevent a tricky debug later on...

Suggested change

-                if isinstance(value, dict):
-                    label = value["label"]
-                    value = value["value"]
-                else:
-                    label = None
+                if isinstance(value, dict):
+                    if "label" in value and "value" in value:
+                        label = value["label"]
+                        value = value["value"]
+                    else:
+                        raise RuntimeError(f"Programming error: unexpected dict value '{value}' found in an Identity's `labeled_dict()`!")
+                else:
+                    label = None

Good suggestion!
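
The defensive check discussed in this thread could also be pulled into a small helper so the intent is documented in one place. This is only a sketch; `split_labeled_value` is a hypothetical name and does not appear in the PR.

```python
from typing import Any, Optional, Tuple


def split_labeled_value(value: Any) -> Tuple[Optional[str], Any]:
    """Return (label, value) for a labeled dict, or (None, value) for a plain value.

    Hypothetical helper: a dict without both keys is treated as a programming error.
    """
    if isinstance(value, dict):
        if "label" in value and "value" in value:
            return value["label"], value["value"]
        raise RuntimeError(f"Programming error: unexpected dict value {value!r}")
    return None, value


labeled = split_labeled_value({"label": "Loyalty ID", "value": "CH-1"})
plain = split_labeled_value("jane@example.com")
```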

adamsachs · 2024-04-08T14:56:01Z

src/fides/api/service/privacy_request/dsr_package/dsr_report_builder.py

+        for key, value in privacy_request.get_persisted_identity()
+        .labeled_dict(include_default_labels=True)
+        .items()
+        if value["value"] is not None


is this not a bit risky? i may be missing something, a bit hard for me to determine the possible values here!

Suggested change

-        if value["value"] is not None
+        if value.get("value") is not None

Same reasoning as the other ["value"] access: this should be a dict with label and value keys, and we want to error if that's not the case.
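
To make the trade-off in this thread concrete, here is a small sketch with made-up data (the `labeled` dict and both function names are illustrative, not from the PR). Indexing with `["value"]` fails loudly on a malformed entry, while `.get("value")` silently drops it.

```python
from typing import Any, Dict

labeled: Dict[str, Dict[str, Any]] = {
    "email": {"label": "Email", "value": "jane@example.com"},
    "loyalty_id": {"label": "Loyalty ID"},  # malformed on purpose: no "value" key
}


def strict_filter(items: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    # Raises KeyError as soon as an entry is missing "value".
    return {key: item for key, item in items.items() if item["value"] is not None}


def lenient_filter(items: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    # Treats a missing "value" like None and skips the entry.
    return {key: item for key, item in items.items() if item.get("value") is not None}


try:
    strict_filter(labeled)
    strict_raised = False
except KeyError:
    strict_raised = True

lenient_result = lenient_filter(labeled)
```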

adamsachs · 2024-04-08T17:19:58Z

src/fides/api/util/collection_util.py

might be nice to get a bit of unit test coverage on these new functions specifically?

Added tests around the mutability functions

lovely! thank you, those look great. they also work as a form of documenting the functionality :)

adamsachs · 2024-04-08T17:25:16Z

src/fides/api/task/graph_task.py

-        return output
+                    output[FIDESOPS_GROUPED_INPUTS].add(make_immutable(grouped_data))
+
+        return make_mutable(output)


ok! this is a very helpful explanation. may be good to put a note about this functionality into the method docstring? (it's already a very explanatory docstring :) )

adamsachs · 2024-04-08T17:34:18Z

src/fides/api/util/collection_util.py

+def make_immutable(obj: Any) -> Any:
+    if isinstance(obj, dict):


docstrings here and on make_mutable would be nice!

Added docstrings

galvana

I addressed most of your comments except the one about adding more tests to the dict and labeled_dict functions. I'd like to know what test cases you had in mind.

galvana · 2024-04-08T18:07:32Z

src/fides/api/schemas/redis_cache.py


 from fides.api.custom_types import PhoneNumber
 from fides.api.schemas.base_class import FidesSchema

+MultiValue = Union[Union[StrictInt, StrictStr], List[Union[StrictInt, StrictStr]]]


I agree, I was struggling to get these types to work as expected so I completely looked past the double union 😆

galvana · 2024-04-08T18:12:30Z

src/fides/api/models/privacy_request.py

+                if isinstance(value, dict):
+                    label = value["label"]
+                    value = value["value"]
+                else:
+                    label = None


Good suggestion!

galvana · 2024-04-08T18:20:11Z

src/fides/api/models/privacy_request.py

        schema = Identity()
        for field in self.provided_identities:  # type: ignore[attr-defined]
+            value = field.encrypted_value.get("value")
+            if field.field_label:
+                value = LabeledIdentity(label=field.field_label, value=value)
            setattr(
                schema,
-                field.field_name.value,
-                field.encrypted_value["value"],
+                field.field_name,  # type:ignore
+                value,  # type:ignore
            )
        return schema


This is a good approach 👍

galvana · 2024-04-08T18:40:54Z

src/fides/api/service/privacy_request/dsr_package/dsr_report_builder.py

+        for key, value in privacy_request.get_persisted_identity()
+        .labeled_dict(include_default_labels=True)
+        .items()
+        if value["value"] is not None


Same reasoning as the other ["value"] access: this should be a dict with label and value keys, and we want to error if that's not the case.

galvana · 2024-04-08T18:46:32Z

src/fides/api/task/graph_task.py

-        return output
+                    output[FIDESOPS_GROUPED_INPUTS].add(make_immutable(grouped_data))
+
+        return make_mutable(output)


The output dictionary is constructed with deduplicated values for each key, ensuring that the value lists
and the fides_grouped_input list contain only unique elements.
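
The deduplication described above relies on converting dicts and lists into hashable values so they can live in a set, then converting back. The bodies below are a guess at how such helpers might look, not the implementation in `collection_util.py`; only the names `make_immutable` and `make_mutable` come from the diff, and the sample rows are made up.

```python
from typing import Any


def make_immutable(obj: Any) -> Any:
    """Recursively turn dicts into frozensets of pairs and lists/sets into tuples."""
    if isinstance(obj, dict):
        return frozenset((key, make_immutable(value)) for key, value in obj.items())
    if isinstance(obj, (list, set)):
        return tuple(make_immutable(item) for item in obj)
    return obj


def make_mutable(obj: Any) -> Any:
    """Reverse make_immutable; assumes every frozenset came from a dict."""
    if isinstance(obj, frozenset):
        return {key: make_mutable(value) for key, value in obj}
    if isinstance(obj, dict):
        return {key: make_mutable(value) for key, value in obj.items()}
    if isinstance(obj, (tuple, set, list)):
        return [make_mutable(item) for item in obj]
    return obj


grouped_rows = [
    {"email": ["a@example.com"], "loyalty_id": ["CH-1"]},
    {"email": ["a@example.com"], "loyalty_id": ["CH-1"]},  # duplicate row
    {"email": ["b@example.com"], "loyalty_id": ["CH-2"]},
]

# Dicts are unhashable, so each row is frozen before being added to the set.
unique_rows = {make_immutable(row) for row in grouped_rows}
deduplicated = make_mutable(unique_rows)
```

Because sets are unordered, `deduplicated` contains the two distinct rows in no guaranteed order.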

galvana · 2024-04-08T18:57:15Z

src/fides/api/util/collection_util.py

+def make_immutable(obj: Any) -> Any:
+    if isinstance(obj, dict):


Added docstrings

galvana · 2024-04-08T19:03:02Z

src/fides/data/sample_project/postgres_sample.sql

+('CH-1', 'Jane Customer', 100, 'Cookie Rookie'),
+('CH-2', 'John Customer', 200, 'Cookie Connoisseur');


I just thought it sounded funny 😆 I didn't know about the site

galvana · 2024-04-08T19:03:55Z

src/fides/api/util/collection_util.py

Added tests around the mutability functions

adamsachs

looking great after the latest updates! discussed some potential further tweaks offline that may be nice, and also getting some FE code review, but this is looking good to go from my end 👍

jpople · 2024-04-09T19:23:39Z

clients/privacy-center/components/modals/privacy-request-modal/PrivacyRequestForm.tsx

+      // extract identity input values
+      const identityInputValues = Object.fromEntries(
+        Object.entries(action.identity_inputs ?? {}).map(([key, field]) => {
+          const value = values[key] || null;


Do we care about handling boolean and number values here? If values[key] is falsy (including false or 0), this syntax will overwrite that with null. If this is just trying to fall back if the value doesn't exist, I would prefer using ??.

Good catch, I added explicit checks for undefined and "" before falling back to null.
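
The same pitfall exists in Python, where `or` plays the role of `||`. The sketch below is a Python analogue of the fix described in the reply, not the TypeScript that was committed; both function names are hypothetical.

```python
from typing import Any


def fallback_with_or(value: Any) -> Any:
    # Any falsy input (0, False, "") is replaced with None.
    return value or None


def fallback_explicit(value: Any) -> Any:
    # Only a missing value or an empty string falls back to None.
    return None if value is None or value == "" else value


lossy = [fallback_with_or(v) for v in (0, False, "", "CH-1")]
preserved = [fallback_explicit(v) for v in (0, False, "", "CH-1")]
```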

First pass of custom identities

8ceedea

galvana added 2 commits March 31, 2024 17:39

Fixing tests

303fdaf

Fixing migration

e4affb3

galvana added 9 commits April 1, 2024 08:52

Formatting cleanup

002aaeb

Cleanup

56bde74

Merge branch 'main' into PROD-1806-custom-identities

75e513d

Updating identity caching to support dicts

5e44592

Fixing cache tests

c763758

Removing unused import

20ba58d

Fixing test

369c9ba

Fixing integration test

d728357

Merge branch 'main' into PROD-1806-custom-identities

c635afc

galvana changed the title ~~First pass of custom identities~~ Custom identities Apr 2, 2024

vercel bot deployed to Preview April 2, 2024 04:56 View deployment

Merge branch 'main' into PROD-1806-custom-identities

a71d715

vercel bot deployed to Preview April 2, 2024 17:25 View deployment

galvana added 2 commits April 2, 2024 11:55

Updating Privacy Center form to support custom identities

15e892d

Updating Admin UI to support new identity format

bdbfe8c

vercel bot deployed to Preview April 2, 2024 21:51 View deployment

galvana added 2 commits April 3, 2024 11:50

Updating pre_process_input_data to deduplicate results

1ce66d2

Additional front-end cleanup

dd38cb0

vercel bot had a problem deploying to Preview April 4, 2024 02:15 Failure

galvana added 3 commits April 3, 2024 20:47

Adding integration test for custom identities

24f3925

Fixing tests

2f6d878

Front-end fixes

afc550b

vercel bot deployed to Preview April 4, 2024 05:17 View deployment

Fixing smoke test

674dfe2

galvana added 4 commits April 4, 2024 14:10

Merge branch 'main' into PROD-1806-custom-identities

035d180

Adding test

6f19304

Updating Cookie House demo

db238f8

Adding extra identities to DSR package

8255b04

vercel bot deployed to Preview April 5, 2024 04:57 View deployment

galvana added 2 commits April 5, 2024 08:22

Fixing mypy errors

0f9716f

Merge branch 'main' into PROD-1806-custom-identities

910e9ae

vercel bot deployed to Preview April 5, 2024 15:25 View deployment

galvana added 2 commits April 8, 2024 07:58

Merge branch 'main' into PROD-1806-custom-identities

705ee4f

Codecov test

581e18a

adamsachs reviewed Apr 8, 2024

View reviewed changes

galvana added 3 commits April 8, 2024 10:50

Fixing privacy request form validation

8e1043c

Merge branch 'main' into PROD-1806-custom-identities

a2da376

Changes based on PR feedback

ae6e330

galvana commented Apr 8, 2024

View reviewed changes

Adding more tests

6ae9c8b

adamsachs approved these changes Apr 8, 2024

View reviewed changes

Fixing down rev

8e84a72

galvana requested a review from jpople April 9, 2024 17:50

jpople reviewed Apr 9, 2024

View reviewed changes

galvana added 4 commits April 9, 2024 13:16

Updating sample project to fix smoke test

4e42ce9

Fixing value fallback logic

ccd5374

Fixing fallback logic

d8dfdd9

Merge branch 'main' into PROD-1806-custom-identities

4b448f0

galvana requested a review from jpople April 9, 2024 22:39

jpople approved these changes Apr 10, 2024

View reviewed changes

Updating change log

98f7c26

galvana merged commit 3acd0ab into main Apr 10, 2024
46 checks passed

galvana deleted the PROD-1806-custom-identities branch April 10, 2024 05:53
